freenode_mathfandomcom-20200215-history
Compactness and Applications
Seminar I guess I'll begin; feel free to interrupt with a ! as per usual. Typically people consider 'model theory' to be the same as 'the model theory of first-order logic', at least in informal discussion. Since I'm not exactly an expert on the situation, that's what I'll consider to be model theory too. :) I don't want to dig too deep into why, but I think it bears mentioning why first-order logic is special. First, to be clear, first-order logic is what happens when logical statements are allowed to have the quantifiers (x P(x)) "for all x, P(x)" and (Ex P(x)) "exists x, P(x)". along with the other logical connectives (^ "and", v "or", - "not") and so on. ! Yeah, maxote and more as '->', '<->', exor, ... Yes; usually we just care about the "minimal set" ((x), ^, and -) because everything else can be defined in terms of those three. (actually, you can do better, ((x), and nand), but that's annoying). The other major contestant for mathematical logical system is second-order logic, which allows quantifying over subsets of a set. So there's a quantifier for "every subset of X, P(x)", "exists a subset of X, P(x)", and so on. It's much more expressive than first-order logic. The reason first-order logic is considered somewhat more desirable is that second-order logic has two non-equivalent ways of interpreting it, both of which have desirable propertites. By contrast, first-order logic only really has the standard interpretation, and this interpretation has several very nice properties, including "completeness" and "compactness" which is what I wanted to talk about today. So if all goes well we should get through proving compactness, then completeness, and then the five easy examples of using compactness. Any questions? (none) Alright. To begin with, I need to point out (and few people do this, I think), that when I'm talking in general, my statements are in ZFC. I don't actually need the axiom of choice; we can get by with a simpler version called the ultrafilter lemma, but choice makes life easier. So let's begin with the definition of a language: Def'n: A language is a set of symbols, each with a type (constant, function, relation) and an arity (= number of arguments). Example: The language of groups is (0, +, -), where the second is a binary operator, and the third is a unary operator. As a matter of convention I assume that equality is given to all models; it's sort of a philosophical problem whether or not you accept equality as having a unique interpretation... Anyway, with this language we can write tonnes of sentences in the language of groups; for instance, (Ex (y (xy = y ^ y = yx))) asserts the existence of identities. ... that should be (y (Ex (xy = y ^ y = yx))), my quantifiers are backwards. ! * ChanServ has changed the topic to: Seminar in progress. If you have a question, say '!' and wait to be called. Another example is the language of graphs, ® a binary relation that is true when two nodes are linked. maxote is it propositional second-order logic or predicated second-order logic? maxote: neither, first-order predicate calculus. ! vixey what's a unique interpretation (wrt equality)? vixey: we need to explain models to explain that. hold a sec :) Now, with a language you can write lots of statements. Sets of statements are called theories. Actually, the kind of theories you write down are usually called axiomatizations. ! So group theory has a set of axioms for associativity, inverses, and identites. maxote what mean the uppercased symbols and the lowercased symbols? are they variables and functors/constants? or viceversa? Ah, sorry. The symbols don't mean anything, they're pure notation. i need to distinguish variables versus constants I'll always use x, y as variables. Take a, b, c, d, e to be constants, f, g, ... k to be functions, and R, S, ... V to be relations. okay Sorry for the confusion. Anyway, the moral of the story was that theories are just syntax, sets of sentences. We'll assume some sort of proof theory that we'll never need to explicitly touch so long as it gives us a way of knowing when a theory can deduce, syntactically, other sentences. This is summed up with a symbol |-, the turnstile, which means deduction. So if A is a theory and s is a sentence, A |- s means we put A into our proof machine and s pops out. We can also say that if A and B are theories, A |- B if we put A into our proof machine and every sentence of B pops out. Now, we need to talk about models. Def'n: A model of a language L is a set together with 'interpretations' of the various symbols in L. I.e., it has a function from L to its underlying set that maps constants to elements of the model, functions to functions of sets, and relations to relations of sets. Example: {0, 1} is a model of the language of groups when 0 -> 0, + -> addition mod 2, and - -> inverses mod 2. Note that a model of a language doesn't have to satisfy any axioms, it just has to have the required interpretations. Now, we have a semantic relationship between models and sentences, in that a sentence is true within a model when, once you substitute symbols for interpretations, the resulting sentence is true in ZFC. I mean substitute interpretations for symbols. This relationship is called consequence, and has the symbol |= So if M is a model and t is a sentence, we write M |= t To make matters worse, if M is a model and A is a theory, we can write M |= A And to make matters even worse, if A is a theory, we can write A |= t iff every M such that M |= A is such that M |= t. That's to say, A |= t iff every model of A is a model of t. This operator overloading is a pain in the ass, and very confusing. So we have these two model-theoretical relationships, one syntactic and one semantic, |- and |= Back in the early days, Hilbert wanted to show that everything that was logically true was true in every model, and what was true in every model was also logically deduceable. That's called "completeness"; in symbols, A |- t iff A |= t. Godel proves it by simplifying propositions down to a certain form that can be verified to have models using his weird numbering scheme Then a guy named Leon Henkin found a way to construct models by making certain larger languages that had more constants than needed. But the modern proof works by reducing this to the compactness theorem, which I'll state shortly. This really is the most elementary way of proving completeness, as far as I'm aware; unfortunately it requires understanding ultrafilters :/ So the compactness theorem looks pretty trivial: ! maxote i don't think A |- t iff A |= t , i think valid "if A |- t then A |= t" but not valid "if A |= t then A |- t" I don't follow The completeness theorem says that everything that can be proven can be deduced, and vice versa. A lot of logical systems don't have completeness, but first-order logic with does. does that answer your questio? *question? * pengrate (n=ajcave@S0106000129681146.ed.shawcable.net) has left #mathematics ("Leaving") i dude Oh, vixey: When I said that equality had a unique interpretation earlier, I meant that it means equality as interpreted as objects in the model. It's not the case that theories supply a binary symbol '=' that models have to supply an interpretation for, they get it for free from set theory. is there anything else I need to clear up before compactness? Alright, I guess. Theorem (Compactness): Let A be a consistent theory such that every finite subset of A has a model. Then A has a model. Consistent here means that A doesn't deduce a contradiction, P ^ -P. ! maxote: e.g. succ(succ(pred(x))) |= succ(x) but i've not succ(succ(pred(x))) |- succ(x) , i've succ(pred(x)) |- x, it was the example that i wanted to say about my dude succ(succ(pred(x))) |- succ(x) follows from succ(pred(x)) |- x and Leibniz' rule So anyway For compactness we have all the finite subsets i of A, and for each one a whole slew of models M_i, with M_i |= i And we want a way to glue them together to get some *M such that *M |= A. It happens that the ultraproduct construction is the way to do this. First, a filter F over a set X is a set of subsets of X that satisfies three axioms: 1) X in F 2) A in F, B in F implies A ^ B in F 3) A in F, A <= B <= F implies B in F. An ultrafilter further means that for every subset S of X, either S in F or X \ S in F. We've seen these things in other seminars, I think, so I won't belabor them. ! thermoplyae what other seminars? I dunno, I thought I saw them somewhere Maybe I was hallucinating? not that i want you to belabor them carry on It's necessary to distinguish so-called principal ultrafilters from non-principal ultrafilters. Take X = {0, 1}; a good ultrafilter is We say that's "generated" by the set {0}, a singleton. Once you have a singleton, you know that the ultrafilter is just every subset containing that element. A non-principal ultrafilter isn't generated by a singleton. They're magical things whose existence is equivalent to the ultrafilter lemma: Every filter is contained in at least one ultrafilter. The standard filter for making non-principal ultrafilters is the Frechet filter F', a subset S of X is in F'(X) iff X \ S is finite. The ultrafilter containing F' is non-principal, provided that X is infinite. If X is finite, then there are no non-principal ultrafilters, sad us. The 'right way' to glue those models M_i together is to take their direct product, and then "mod out by a non-principal ultrafilter U", a bunch of jargon that I'll unpack here in a sec So the direct product of M_i as sets is going to be \Pi M_i Elements look like infinite vectors (m_1, m_2, ...) where m_i is from M_i Now take U to be a non-principal ultrafilter over M_i's indexing set, P(A) We say that (m_1, ...) ~ (n_1, ...) if the set {i : m_i = n_i} is in U. If you think of U as a complete set of "big enough" subsets of P(A), this just means that m and n agree on a "big enough" set of indices. (\Pi M_i) / U is then what happens when you quotient out by this equivalence relationship. I'll refer to (\Pi M_i) / U as *M to save keystrokes. The constants of *M are equivalence classes of the interpretation of constants in each of the submodels M_i. The functions and relations of *M are then defined in a similar way. One of the truly deep theorems about *M is Los' theorem, which says that the statements true in a big enough set of the submodels is true in *M, and vice versa. This is fantastically weird, because we've done a really strange, possibly requiring large inaccessible cardials construction ! maxote what's direct product of M_i? the elements of that direct product are lists of elements of each M_i. (m_1, m_2, m_3, ...), but that list is possibly uncountable. so, M_i is finite or infinite? the indexing set is a power set of a countable set, so see the continuum hypothesis. Oh, I mean infinite. It's not the power set, either, it's only the finite subsets. Sorry if this is confusing, it's 0500 here :( To say again; the indexing set for the direct product M_i is the set of all finite subsets of the consistent theory A. The premise of the compactness theorem gives us a model M_i for each finite subset i of A. Then we glue them together into the ultraproduct, and Los' theorem tells us that the theory of *M is the theory of each of the little M_i's glued together. To wrap this up (as I've got to go to work soon) there are five easy applications of compactness. 1) if T has arbitrarily large finite models, T has an infinite model. If T has arbitrarily large models, then it can deduce the proposition "T has 3 elements", "T has 4 elements", ... Since it has arbitrarily large finite models, you can take any finite subset of T and those propositions and satisfy them. So by compactness, you can satisfy all of them. 2) If every finite subgraph of a graph is n-colorable, then the whole graph is n-colorable (Four-Color Theorem) As it happens, n-colorability is a first-order proposition. ! maxote how is large one model? it depends For finite cardinalities, the axioms of a system can determine the size of the model You can have an axiom that says "T has exactly two elements" (Ex (Ey (x != y)) sorry, that's "has at least two elements" Anyway, for infinite cardinals, because first-order logic only quantifies over elements, you can only say "T has a countably infinite number of elements" But then models with uncountably infinite cardinality will also satisfy it. ! yeah what's the cardinality of a set of subsets? e.g. a set that has 2 subsets where one subset has trillions of elements and another thousands of elements. For finite sets, 2^|X|, for infinite sets, see continuum hypothesis. Those are maximums, anyway. Obviously the smallest a set of subsets can be is zero. 3) Every first-order sentence true in fields of characteristic zero is true in fields of sufficiently large characteristic. (Abraham Robinson) So if t is true in Q, it only "uses" the existence of a finite number, since proofs are finite So fields of sufficiently large characteristic will have all the numbers that t uses in its proof, end of game. 4) Every injective function C^n -> C^n is surjective (James Ax) The finite subtheories here are the theory of algebraically closed fields of prime characteristic. for fixed n, the theory of (ACF_p)^n approaches the theory of C^n as p becomes large. ACF_p has so much structure that the theorem holds for all them. and 5) There exist non-standard models of R with infinitesimal elements. (Robinson, again) look at R^\omega, it looks like (1, 1/2, 1/3, 1/4, ...) eh, scratch that the simpler way is to adjoin the axioms "there exists x such that 0 < x < 1/n" for each natural n * madmath has quit ("Ex-Chat") each finite subtheory is satisfied by R by density so the whole shebang is satisfied by compactness, and there exists x such that for all n, 0 < x < 1/n. ! That's *R for you, and a truly misspent undergraduate career. maxote In 2) , for me, not every finite subgraph is n-colorable implies its supergraph is n-colorable. Eg. subgraph of 1 node and 1-colorable, supergraph of 2 adjacent nodes is not 1-colorable. That's beyond the scope of the theorem. It's not an iff, it's an if-then. then it will be inmho 2) If the whole graph is n-colorable, then every finite subgraph of a graph is n-colorable That's also true by the converse of compactness. So that's the end of what I had planned for today Typing is much longer than talking >:( If there are more questions, I'll stick around for a bit but otherwise I'm getting ready for work Category:Seminar Category:Compactness